Smart Paper Notes

BED: A Real-Time Object Detection System for Edge Devices

Guanchu Wang, Zaid Pervaiz Bhat, Zhimeng Jiang, Yi-Wei Chen, Daochen Zha, Alfredo Costilla Reyes, Afshin Niktash, Gorkem Ulkar, Erman Okman, Xia Hu

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-02-14

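Deploying deep neural networks (DNNs) on edge devices provides efficient and effective solutions for real-world tasks. Edge devices have been used to collect large volumes of data efficiently in different domains, and DNNs are an effective tool for data processing and analysis. However, designing DNNs for edge devices is challenging because of limited computational resources and memory. To address this challenge, we demonstrate BED, an object detection system for edge devices, on the MAX78000 DNN accelerator. It integrates on-device DNN inference with a camera for image acquisition and an LCD display for showing the detections. BED is a concise, effective, and detailed solution that covers model training, quantization, synthesis, and deployment. Experimental results show that BED produces accurate detections with a tiny 300 KB DNN model that needs only 91.9 ms of inference time and 1.845 mJ of energy.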
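The abstract lists quantization as one stage of the pipeline but does not describe the scheme. The following is a minimal sketch of symmetric post-training quantization, assuming 8-bit weights and a single shared scale; the function names and the bit width are illustrative assumptions, not the authors' implementation.

```python
def quantize_symmetric(weights, num_bits=8):
    """Map floats to signed integers using one shared scale (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer levels."""
    return [v * scale for v in q]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the usual trade-off between model size and accuracy in this kind of scheme.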


AutoVideo: An Automated Video Action Recognition System

Daochen Zha, Zaid Pervaiz Bhat, Yi-Wei Chen, Yicheng Wang, Sirui Ding, Jiaben Chen, Kwei-Herng Lai, Mohammad Qazim Bhat, Anmoll Kumar Jain, Alfredo Costilla Reyes

Categories: Computer Vision | Machine Learning

2021-08-09

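Action recognition is an important task for video understanding with broad applications. However, developing an effective action recognition solution often requires extensive engineering effort to build and test different combinations of modules and their hyperparameters. In this demo, we present AutoVideo, a Python system for automated video action recognition. AutoVideo features 1) a highly modular and extensible infrastructure that follows the standard pipeline language, 2) a list of primitives for pipeline construction, 3) data-driven tuners that reduce the effort of pipeline tuning, and 4) an easy-to-use graphical user interface (GUI). AutoVideo is released under the MIT license at https://github.com/datamllab/autovideo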
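The idea of composing primitives into a pipeline and letting a tuner pick hyperparameters can be sketched in plain Python. This is not AutoVideo's API: `make_pipeline`, `grid_search`, `build`, and `score` are hypothetical names, the steps are toy stand-ins for real primitives, and exhaustive grid search is used here only as the simplest possible tuner.

```python
from itertools import product


def make_pipeline(steps):
    """Chain callables so the output of one step feeds the next."""
    def run(x):
        for step in steps:
            x = step(x)
        return x
    return run


def grid_search(build, search_space, score):
    """Try every hyperparameter combination and keep the best-scoring one."""
    names = list(search_space)
    best_params, best_score = None, float("-inf")
    for values in product(*(search_space[n] for n in names)):
        params = dict(zip(names, values))
        s = score(build(**params))
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score


def build(stride, scale):
    """Toy pipeline: subsample frames, scale them, then reduce to one number."""
    return make_pipeline([
        lambda frames: frames[::stride],             # stand-in for frame sampling
        lambda frames: [f * scale for f in frames],  # stand-in for feature extraction
        sum,                                         # stand-in for a classifier score
    ])


def score(pipeline):
    """Hypothetical objective: output should be as close to 10 as possible."""
    return -abs(pipeline([1, 2, 3, 4]) - 10)


best_params, best_score = grid_search(
    build, {"stride": [1, 2], "scale": [1, 2, 3]}, score
)
```

Keeping the search space as a plain dictionary means new primitives or hyperparameters can be added without touching the tuner.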


Double U-Net for Super-Resolution and Segmentation of Live Cell Images

Mayur Bhandary, J. Patricio Reyes, Eylul Ertay, Aman Panda

Categories: Computer Vision

2022-12-05

Accurate segmentation of live cell images has broad applications in clinical and research contexts. Deep learning methods have been able to perform cell segmentation with high accuracy; however, developing machine learning models to do this requires access to high-fidelity images of live cells. Such images are often unavailable because of resource constraints, such as limited access to high-performance microscopes, or because of the nature of the studied organisms. Segmentation of low-resolution images of live cells is a difficult task. This paper proposes a method for live cell segmentation on low-resolution images that performs super-resolution as a pre-processing step in the segmentation pipeline.


BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.


Towards Asteroid Detection in Microlensing Surveys with Deep Learning

Preeti Cowan, Ian A. Bond, Napoleon H. Reyes

Categories: Computer Vision | Machine Learning

2022-11-04

Asteroids are an indelible part of most astronomical surveys though only a few surveys are dedicated to their detection. Over the years, high cadence microlensing surveys have amassed several terabytes of data while scanning primarily the Galactic Bulge and Magellanic Clouds for microlensing events and thus provide a treasure trove of opportunities for scientific data mining. In particular, numerous asteroids have been observed by visual inspection of selected images. This paper presents novel deep learning-based solutions for the recovery and discovery of asteroids in the microlensing data gathered by the MOA project. Asteroid tracklets can be clearly seen by combining all the observations on a given night and these tracklets inform the structure of the dataset. Known asteroids were identified within these composite images and used for creating the labelled datasets required for supervised learning. Several custom CNN models were developed to identify images with asteroid tracklets. Model ensembling was then employed to reduce the variance in the predictions as well as to improve the generalisation error, achieving a recall of 97.67%. Furthermore, the YOLOv4 object detector was trained to localize asteroid tracklets, achieving a mean Average Precision (mAP) of 90.97%. These trained networks will be applied to 16 years of MOA archival data to find both known and unknown asteroids that have been observed by the survey over the years. The methodologies developed can be adapted for use by other surveys for asteroid recovery and discovery.
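Model ensembling and the recall metric mentioned above can be illustrated with a small sketch. It assumes the ensemble simply averages per-image probabilities from several classifiers and thresholds the result at 0.5; the abstract does not say how the CNNs were combined, and the probabilities and labels below are made-up toy values, not data from the paper.

```python
def ensemble_average(prob_lists):
    """Average per-image probabilities across models (assumed combination rule)."""
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]


def recall(y_true, y_pred):
    """Fraction of true positives that were actually detected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def threshold(probs, cutoff=0.5):
    """Turn probabilities into 0/1 predictions."""
    return [1 if p >= cutoff else 0 for p in probs]


# Toy outputs of three hypothetical classifiers on five images.
model_probs = [
    [0.9, 0.4, 0.2, 0.7, 0.1],
    [0.8, 0.6, 0.1, 0.4, 0.2],
    [0.7, 0.7, 0.3, 0.6, 0.3],
]
labels = [1, 1, 0, 1, 0]  # 1 = image contains a tracklet

ensemble_preds = threshold(ensemble_average(model_probs))
ensemble_recall = recall(labels, ensemble_preds)
single_recall = recall(labels, threshold(model_probs[0]))
```

In this toy case the first model alone misses one positive image, while the averaged prediction recovers it, which is the variance-reduction effect that ensembling is meant to provide.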


EDO-Net: Learning Elastic Properties of Deformable Objects from Graph Dynamics

Alberta Longhini, Marco Moletta, Alfredo Reichlin, Michael C. Welle, David Held, Zackory Erickson, Danica Kragic

Categories: Computer Vision | Artificial Intelligence | Robotics

2022-09-19

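We study the problem of learning graph dynamics of deformable objects that generalize to unknown physical properties. In particular, we leverage a latent representation of the elastic physical properties of cloth-like deformable objects, which we explore through a pulling interaction. We propose EDO-Net (Elastic Deformable Object - Net), a model trained in a self-supervised fashion on a large variety of samples with different elastic properties. EDO-Net jointly learns an adaptation module, responsible for extracting a latent representation of the physical properties of the object, and a forward-dynamics module, which uses the latent representation to predict future states of cloth-like objects represented as graphs. We evaluate EDO-Net both in simulation and in the real world, assessing its ability to 1) generalize to unknown physical properties of cloth-like deformable objects and 2) transfer the learned representation to new downstream tasks.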


COMPASS: A Formal Framework and Aggregate Dataset for Generalized Surgical Procedure Modeling

Kay Hutchinson, Ian Reyes, Zongyu Li, Homa Alemzadeh

Categories: Robotics

2022-09-14

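Purpose: We propose a formal framework for modeling surgical tasks using a unified set of motion primitives (MPs) as the basic surgical actions, to enable more objective labeling and aggregation of different datasets and the training of generalized models for surgical action recognition. Methods: We use our framework to create the COntext and Motion Primitive Aggregate Surgical Set (COMPASS), which includes labels for six dry-lab surgical tasks from three publicly available datasets (JIGSAWS, DESK, and ROSMA). Methods for labeling surgical context and automatically translating it to MPs are presented. We propose the Leave-One-Task-Out (LOTO) cross-validation method to evaluate a model's ability to generalize to an unseen task. Results: Our context labeling method achieves near-perfect agreement between consensus labels from crowd-sourcing and expert surgeons. Segmenting tasks into MPs enables the generation of separate left and right transcripts and significantly improves LOTO performance. We find that MP segmentation models perform best when trained on tasks with the same context and/or tasks from the same dataset. Conclusion: The proposed framework enables high-quality labeling of surgical data based on context and fine-grained MPs. Modeling surgical tasks with MPs enables the aggregation of different datasets for training action recognition models that generalize better to unseen tasks than models trained at the gesture level. Significance: Our formal framework and aggregate dataset can support the development of models and algorithms for surgical process analysis, skill assessment, error detection, and autonomy.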
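The Leave-One-Task-Out idea can be sketched as a fold generator that holds out every sample of one task at a time. This is an interpretation based only on the method's name in the abstract; the function name, the `(task, trial_id)` sample layout, and the task names are hypothetical.

```python
def leave_one_task_out(samples):
    """Yield (held_out_task, train, test) with one whole task held out per fold."""
    tasks = sorted({task for task, _ in samples})
    for held_out in tasks:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


# Placeholder (task, trial_id) pairs; real data would carry kinematics or video.
samples = [
    ("task_a", 1), ("task_a", 2),
    ("task_b", 1),
    ("task_c", 1), ("task_c", 2),
]
folds = list(leave_one_task_out(samples))
```

Because the split is by task rather than by trial, the test fold never shares a task with the training fold, which is what makes the score a measure of generalization to an unseen task.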


Elastic Context: Encoding Elasticity for Data-driven Models of Textiles

Alberta Longhini, Marco Moletta, Alfredo Reichlin, Michael C. Welle, Alexander Kravberg, Yufei Wang, David Held, Zackory Erickson, Danica Kragic

Categories: Robotics

2022-09-12

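Physical interaction with textiles, such as assistive dressing, relies on advanced dexterous capabilities. The underlying complexity of textile behavior when pulled and stretched is due to both the yarn material properties and the textile construction technique. Today, there are no commonly adopted and annotated datasets on which the various interaction or property identification methods are assessed. One important property that affects the interaction is material elasticity, which results from both the yarn material and the construction technique: the two are intertwined and, if not known a priori, almost impossible to identify through the sensing commonly available on robotic platforms. We introduce Elastic Context (EC), a concept that integrates various properties that affect elastic behavior, to enable more effective physical interaction with textiles. The definition of EC relies on the stress/strain curves commonly used in textile engineering, which we reformulate for robotic applications. We employ EC with a Graph Neural Network (GNN) to learn generalized elastic behaviors of textiles. Furthermore, we explore the effect of EC on accurate force modeling of non-linear real-world elastic behaviors, highlighting the challenges current robotic setups face in sensing textile properties.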


Inference and Learning for Generative Capsule Models

Alfredo Nazabal, Nikolaos Tsagkas, Christopher K. I. Williams

Categories: Machine Learning | Computer Vision

2022-09-07

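Capsule networks (see e.g. Hinton et al., 2018) aim to encode knowledge of, and reason about, the relationship between an object and its parts. In this paper we specify a generative model for such data and derive a variational algorithm for inferring the transformation of each model object in a scene, together with the assignment of observed parts to objects. We derive a learning algorithm for the object models based on variational expectation maximization (Jordan et al., 1999). We also study an alternative inference algorithm based on the RANSAC method of Fischler and Bolles (1981). We apply these inference methods to (i) data generated from multiple geometric objects such as squares and triangles ("constellations"), and (ii) data from a parts-based model of faces. Recent work by Kosiorek et al. (2019) used amortized inference via stacked capsule autoencoders (SCAEs) to tackle this problem; our results show that we significantly outperform them where comparisons can be made (on the constellations data).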
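The RANSAC principle referenced above (sample a minimal set, fit a model, count inliers, keep the best) can be shown on a deliberately simpler problem. The sketch below fits a 2D line rather than the object transformations the paper infers; `ransac_line`, its parameters, and the toy points are illustrative assumptions.

```python
import random


def ransac_line(points, n_iters=200, tol=1e-6, seed=0):
    """Fit y = a*x + b by sampling point pairs and keeping the largest consensus set."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)   # minimal sample for a line
        if x1 == x2:
            continue  # a vertical pair cannot be written as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers


# Ten points on y = 2x + 1 plus three gross outliers.
inlier_points = [(x, 2 * x + 1) for x in range(10)]
outlier_points = [(1, 9), (4, -3), (7, 30)]
model, inliers = ransac_line(inlier_points + outlier_points)
```

In the capsule setting the "model" would be an object's transformation and the "inliers" the observed parts it explains, but the sample-and-verify loop has the same shape.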


Event-based Image Deblurring with Dynamic Motion Awareness

Patricia Vitoria, Stamatios Georgoulis, Stepan Tulyakov, Alfredo Bochicchio, Julius Erbach, Yuanyou Li

Categories: Computer Vision

2022-08-24

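Non-uniform image deblurring is a challenging task because the blurry image itself lacks temporal and textural information. Complementary information from auxiliary sensors, such as event sensors, is being explored to address these limitations. Event sensors asynchronously record changes in logarithmic intensity, called events, with high temporal resolution and high dynamic range. Current event-based deblurring methods combine the blurry image with events to jointly estimate per-pixel motion and the deblur operator. In this paper, we argue that a divide-and-conquer approach is better suited to this task. To this end, we propose to use modulated deformable convolutions whose kernel offsets and modulation masks are dynamically estimated from events to encode the motion in the scene, while the deblur operator is learned from the combination of the blurry image and the corresponding events. Furthermore, we employ a coarse-to-fine multi-scale reconstruction approach to cope with the inherent sparsity of events in low-contrast regions. Importantly, we introduce the first dataset containing pairs of real RGB blurry images and the related events recorded during the exposure time. Our results show better overall robustness when using events, with PSNR improvements of up to 1.57 dB on synthetic data and 1.08 dB on real event data.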
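To put the reported PSNR gains in perspective, the standard definition PSNR = 10 log10(MAX^2 / MSE) implies that a gain of g dB scales the mean squared error by 10^(-g/10). The sketch below applies that relation; it assumes both methods are scored against the same peak value, and the function names are illustrative.

```python
import math


def psnr(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


synthetic_ratio = mse_ratio_for_gain(1.57)  # gain reported on synthetic data
real_ratio = mse_ratio_for_gain(1.08)       # gain reported on real event data
```

Under that assumption, a 1.57 dB gain corresponds to roughly 30% lower MSE, and 1.08 dB to roughly 22% lower MSE.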

